Word endings of length N=1:

rank | frequency | n-gram |
---|---|---|
1 | 44037 | -s |
2 | 39279 | -a |
3 | 31297 | -o |
4 | 20537 | -n |
5 | 18623 | -e |

Word endings of length N=2:

rank | frequency | n-gram |
---|---|---|
1 | 14967 | -os |
2 | 12317 | -as |
3 | 10600 | -es |
4 | 7613 | -do |
5 | 5563 | -an |

Word endings of length N=3:

rank | frequency | n-gram |
---|---|---|
1 | 3523 | -ado |
2 | 2957 | -ión |
3 | 2841 | -dos |
4 | 2833 | -mos |
5 | 2669 | -nte |

Word endings of length N=4:

rank | frequency | n-gram |
---|---|---|
1 | 2481 | -ción |
2 | 2039 | -ados |
3 | 1906 | -ente |
4 | 1895 | -ando |
5 | 1652 | -adas |

Word endings of length N=5:

rank | frequency | n-gram |
---|---|---|
1 | 1856 | -ación |
2 | 1228 | -mente |
3 | 1153 | -iones |
4 | 616 | -dores |
5 | 558 | -iento |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels that of 2.2.5 Most frequent word beginnings, except that the aim here is to detect suffixes rather than prefixes.
For N=3, the corresponding query is:

    SELECT @pos := (@pos + 1) AS pos, xx.*
    FROM (SELECT @pos := 0) r,
         (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word, 3)) AS ngram
          FROM words
          WHERE w_id > 100
          GROUP BY RIGHT(word, 3)
          ORDER BY cnt DESC) xx
    LIMIT 5;
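
For readers without access to the database, the same counting can be sketched in Python. This is a minimal illustration, not the method used to produce the tables: the function name `top_endings` and the toy word list are hypothetical, and the sketch assumes the input holds one entry per word form. String comparison is exact here, whereas the database collation may treat case or accents differently.

```python
from collections import Counter

def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams as (rank, count, ngram)."""
    # Like SQL RIGHT(word, n), a word shorter than n contributes itself as a whole.
    counts = Counter(word[-n:] for word in words)
    # Sort by descending count; break ties alphabetically so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, count, "-" + ngram)
            for rank, (ngram, count) in enumerate(ranked[:k], start=1)]

# Toy word list (hypothetical, not taken from the corpus).
sample = ["nación", "canción", "estación", "hablado", "cantado", "casas"]
for row in top_endings(sample, 3):
    print(row)
# (1, 3, '-ión')
# (2, 2, '-ado')
# (3, 1, '-sas')
```

Changing the second argument from 3 to any other N gives the counts for that ending length, which corresponds to changing the length passed to `RIGHT` in the query.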